Abstract

Chromosome inversions have fascinated the scientific community, mainly because of their 
role in the rapid adaptation of different taxa to changing environments. However, the 
ecological traits linked to chromosome inversions have been poorly studied. Here, we 
investigated the roles played by 23 chromosome inversions in the adaptation of the four 
major African malaria mosquitoes to local environments in Africa. We studied their 
distribution patterns by using spatially explicit modeling and characterized the 
ecogeographical determinants of each inversion range. We then performed hierarchical 
clustering and constrained ordination analyses to assess the spatial and ecological 
similarities among inversions. Our results show that most inversions are environmentally 
structured, suggesting that they are actively involved in processes of local adaptation. Some 
inversions exhibited similar geographical patterns and ecological requirements among the 
four mosquito species, providing evidence for parallel evolution. Conversely, common 
inversion polymorphisms between sibling species displayed divergent ecological patterns, 
suggesting that they might have a different adaptive role in each species. These results are 
in agreement with the finding that chromosomal inversions play a role in Anopheles 
ecotypic adaptation. This study establishes a strong ecological basis for future
genome-based analyses to elucidate the genetic mechanisms of local adaptation in these four 
mosquitoes.

Introduction

Chromosome inversions have been considered by pioneering geneticists as the fingerprints
of evolutionary processes in many different species (Krimbas and Powell 1992; Hoffmann
and Rieseberg 2008). Over the last century, theoretical and experimental works have
positioned chromosome inversions as key actors of chromosome architecture, local
adaptation, sex evolution and speciation (Coghlan et al. 2005; Feuk et al. 2005; van Doorn
and Kirkpatrick 2007; Bhutkar et al. 2008; Hoffmann and Rieseberg 2008; Sharakhova et
al. 2011). Evidence that natural selection acts on chromosomal inversions has been
gathered in many taxa, from mice (Lyon 2003), humans (Stefansson et al. 2005),
Drosophila (Hoffmann et al. 2004) and monkeyflowers (Lowry and Willis 2010) to
mosquitoes (Fouet et al. 2012; Ayala et al. 2013). Evolutionarily, chromosomal inversions
are important because they reduce and, consequently, substantially alter recombination in
heterozygotes. This and their size (spanning hundreds or even thousands of genes)
facilitate the capture of favourable combinations of locally adapted alleles. Indeed, recent
theoretical models (Kirkpatrick and Barton 2006; Manoukis et al. 2008; Schaeffer 2008)
and much ecological evidence support their role in local adaptation processes (Coluzzi et
al. 1979b; Rodriguez-Trelles et al. 1996; Coluzzi et al. 2002; Balanya et al. 2003;
Hoffmann et al. 2004; Ayala et al. 2011; Ayala et al. 2014). Although other forces, such as
limited gene flow and genetic drift, could also explain the inversion clines (Dobzhansky
and Wright 1943), the finding that chromosome inversion frequencies change at different
collection sites was the first hint that they could contribute to local adaptation (Dobzhansky
and Sturtevant 1938). Specifically, the relative frequencies of paracentric inversion
polymorphisms (i.e., when a segment of a chromosome arm, which does not include the
centromere, breaks and is reinserted in the reverse orientation) change in concert with the
variation of different biotic and non-biotic factors. This feature allowed studying the natural
selection forces that drive the evolution of inversions towards fixation (i.e., inversion
frequency close to 100%) or stable polymorphism (i.e., inversion frequency stable over 
time) (Kirkpatrick and Barton 2006; Feder et al. 2011). However, the molecular basis of the 
adaptive role of these inversions remains poorly understood. Kirkpatrick and Kern
(2012) wrote that “the money is on the inversions” to suggest where to find genes
involved in local adaptation. Heliconius butterflies offer a compelling example of how an
inversion captures and protects several adaptive loci linked to mimicry (Joron et al. 2011). In 
Drosophila, the classical animal model for studying chromosome rearrangements, inversion 
frequencies and some genes within the inversion vary along environmental clines (Umina et 
al. 2005; Collinge et al. 2006; Kolaczkowski et al. 2011; Kapun et al. 2016). In Anopheles 
gambiae, comparative genomic studies on carriers of the 2La inversion along an 
environmental aridity gradient led to the identification of divergent genes within the 
inversion (Cheng et al. 2012). These genes mainly encode signalling molecules, gustatory
receptors or ion channels. Interestingly, a remarkable correspondence between
orthologues and their functions was observed between Drosophila and An. gambiae in
comparable environmental clines, providing evidence of potential parallel evolution 
(Kolaczkowski et al. 2011; Cheng et al. 2012). Nevertheless, studies remain scarce. The 
main obstacle to phenotypic experiments lies in the difficulty of identifying the local 
adaptation drivers (ecological, behavioural, sexual, etc.) that model the inversion 
distribution in a population along a cline. For example, in sub-Saharan Africa, the An. 
gambiae 2La inversion frequency varies significantly along an aridity gradient, from 
complete absence in the humid rainforest of Central Africa to fixation in the arid savannas 
of West and East Africa (Coluzzi et al. 1985; Simard et al. 2009). In an attempt to validate 
the hypothesis that aridity tolerance is linked to this inversion, the thermal tolerance and 
desiccation resistance of 2La inversion carriers and non-carriers were tested in laboratory
conditions. The results confirmed that 2La inversion carriers exhibit a higher desiccation
and thermal resistance (Gray et al. 2009; Rocca et al. 2009; Fouet et al. 2012). Moreover,
genome-wide expression analyses before and after exposure to thermal stress revealed that
a large number of stress-linked genes were up-regulated in larvae carrying the inversion,
compared with non-carriers (Cassone et al. 2011). However, only two of these up-regulated
genes were in common with those found along the latitudinal cline (Cheng et al. 2012).
Unfortunately, this is the only inversion for which phenotypic and molecular analyses have
been carried out. Therefore, to understand the genetic mechanism of local adaptation, it is
essential to identify the specific adaptive traits that shaped inversion frequencies in any
species.

The availability of polytene chromosomes in Anopheles is a unique opportunity for
studying chromosome evolution (Green and Hunt 1980; Coluzzi et al. 2002; Sharakhov et
al. 2002; Sharakhova et al. 2006; Sharakhova et al. 2010a; Sharakhova et al. 2010b;
Sharakhova et al. 2011; Sharakhova et al. 2013). In Africa, four Anopheles species have
received most attention: An. gambiae, An. coluzzii, An. arabiensis and An. funestus. The
first three species belong to the same complex, although An. gambiae and An. coluzzii have
been recently proposed as separate species (Coetzee et al. 2013). Until the development of
molecular diagnostic tools in the 1990s, An. arabiensis was distinguished from the other
sibling species in the Anopheles gambiae complex on the basis of post-zygotic barriers and
five fixed inversions on chromosome X (Davidson 1964; Davidson and Hunt 1973; Coluzzi
et al. 2002). These four species can thrive in a wide range of ecological environments and
live in sympatry in many regions of sub-Saharan Africa (Ayala et al. 2009; Sinka et al. 2012).
Their ecological success and their anthropic habits (resting, feeding and breeding
preferences) make them the most efficient malaria vectors. Chromosomal
rearrangements have been directly involved in Anopheles ecological and behavioural 
plasticity (Ayala et al. 2014). However, little is known about the specific environmental 
requirements of chromosomal inversions and how they promote local adaptation in these 
malaria mosquitoes. 

In this study, we investigated, from a macro-ecological perspective, the roles played by 23 
chromosome inversion polymorphisms in the adaptation of the four major African malaria 
vectors (Anopheles gambiae, An. coluzzii, An. arabiensis and An. funestus) to local 
environments in West-Central Africa. We studied the distribution patterns of such inversion 
polymorphisms by using spatially explicit modelling and characterized the ecogeographical 
determinants of each inversion range. We then performed hierarchical clustering and 
constrained ordination analyses to assess the spatial and ecological similarities of the 23 
inversions. Our results show that most inversions are environmentally structured, 
suggesting that they are actively involved in processes of local adaptation to a wide range 
of habitats. Some inversions exhibited similar geographical patterns and ecological 
requirements among the four mosquito species, providing evidence for parallel evolution. 
Conversely, inversion polymorphisms that were considered to be common between sibling 
species displayed divergent ecological patterns, suggesting that they might have a different 
adaptive role in each species. These results are in agreement with the finding that 
chromosomal inversions play a role in Anopheles ecotypic adaptation, making these mosquitoes
more adaptable and, therefore, more ubiquitous and potent malaria vectors. 

Materials and Methods 

Chromosome inversion frequency data

The full karyotypes of 35,618 mosquitoes were used in this study, among which 16% of
specimens represented unpublished data (Table 1, Table S1-S4). In total, 23 polymorphic
inversions in An. gambiae, An. coluzzii, An. arabiensis and An. funestus were reviewed and
analysed (see Table 1). We characterized the inversions in each species, even if they were
considered as common, to avoid potential biases in the analysis. Mosquito sampling was
carried out in 27 countries in Africa and Saudi Arabia and included the dry savannahs of
Senegal and South Africa, the Highlands of Kenya and Madagascar, and the rainforests of
Cameroon and the Ivory Coast (Figure 1). This vast territory is highly heterogeneous, thus
increasing the genetic diversity of mosquito species (Lehmann et al. 2003; Michel et al.
2005).

Mosquitoes were collected from 1979 to 2007. We obtained the frequencies of each
inversion in each sampled locality (n=1617 villages) and for each species (Table 1).
Inversions in Anopheles are arbitrarily classified as standard or inverted. To avoid any
potential bias, we included the frequencies of both inverted and standard forms in our
analyses. The geographical coordinates of the original sampling localities were recorded
using a hand-held GPS receiver. Unknown geographical coordinates of localities reported
in the literature were validated through the National Geospatial-Intelligence Agency
[http://geonames.nga.mil]. For modelling purposes, we used 5 km x 5 km grid squares as
territorial units on the basis of the mosquito dispersal capabilities (Costantini et al. 1996;
Ayala et al. 2013). Mosquito information was then assigned to the square in which the
sampling village was located.
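
The assignment of sampling villages to 5 km x 5 km territorial units can be sketched as follows. This is only an illustration under stated assumptions, not the GIS workflow used in the study: it relies on a crude equirectangular approximation (about 111.32 km per degree), it pools the counts of villages that share a square, and the function names, record layout and coordinates are hypothetical.

```python
import math
from collections import defaultdict

KM_PER_DEGREE = 111.32  # approximate length of one degree (assumption)
CELL_KM = 5.0           # side of a territorial unit, as stated in the text


def grid_cell(lat, lon, ref_lat=0.0):
    """Return the (row, col) index of the 5 km square containing a point.

    Equirectangular approximation around ref_lat; a real analysis would
    use a projected coordinate system instead.
    """
    y_km = lat * KM_PER_DEGREE
    x_km = lon * KM_PER_DEGREE * math.cos(math.radians(ref_lat))
    return (math.floor(y_km / CELL_KM), math.floor(x_km / CELL_KM))


def pool_by_cell(records):
    """Sum inverted and total counts of villages falling in the same square.

    records: iterable of (lat, lon, n_inverted, n_sampled) tuples
    (hypothetical layout).
    """
    cells = defaultdict(lambda: [0, 0])
    for lat, lon, n_inverted, n_sampled in records:
        cell = grid_cell(lat, lon)
        cells[cell][0] += n_inverted
        cells[cell][1] += n_sampled
    return {cell: tuple(counts) for cell, counts in cells.items()}


# Hypothetical villages: the first two lie about 1.6 km apart.
villages = [
    (4.000, 11.010, 3, 20),
    (4.010, 11.020, 5, 30),
    (4.500, 11.500, 10, 10),
]
pooled = pool_by_cell(villages)
```

In this toy input the first two villages share a square, so their counts are summed, while the third village occupies a square of its own.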

Karyotyping analysis was identical for the four mosquito species. Briefly, ovaries of
half-gravid females were dissected and stored in Carnoy's fixative solution (three parts 100%
ethanol: one part glacial acetic acid, by volume). They were subsequently prepared
according to standard protocols to obtain the polytene chromosomes (della Torre 1997).
Paracentric inversions were identified and scored according to their respective chromosome 
maps (Coluzzi et al. 2002; Sharakhov et al. 2004). Only the most common polymorphic 
inversions were used to ensure sufficient statistical power for the subsequent analyses 
(Table 1). 

Ecogeographical predictors 

On the basis of their potential predictive power and according to previous studies on 
mosquitoes (Ayala et al. 2009; Costantini et al. 2009; Simard et al. 2009; Sinka et al. 2010), 
we selected 11 ecogeographical predictors as potential drivers of the distribution patterns of 
each chromosomal inversion. Predictors fell within three categories: spatial (four 
predictors), land (four predictors) and climate (three predictors). All information was 
transferred to the territorial units by using zonal statistics tools. 

Spatial variables. Spatial predictors were investigated to uncover geographical trends in 
the distributions associated with historical events, or species - inversions - population 
dynamics (Real et al. 2003). We investigated four spatial predictors: latitude, longitude, the 
product of latitude and longitude, and distance to the equator (measured as the absolute 
value of latitude) of each territorial unit, to account for the data spatial structure (Legendre 
1998; Kennington et al. 2006). 

Land variables. The importance of land use in explaining insect distribution patterns is 
well known (Acevedo et al. 2010; Hortal et al. 2010), including for Anopheles species 
(Manoukis et al. 2008; Simard et al. 2009). Here, we considered the Normalized Difference 
Vegetation Index (NDVI) and its seasonal variations (Acevedo et al. 2010). NDVI is a 
measure of the amount and vigour of vegetation on the land surface directly related to soil 
moisture, and has been successfully used to highlight changes in land cover (Nicholson et 
al. 1990; Nicholson and Farrar 1994). The NDVIs were derived from a monthly imaging
dataset (http://modis.gsfc.nasa.gov/data/dataprod/mod13.php) over a 13-year period, from
1998 to 2010, at a spatial resolution of ~1 km. Four different NDVI-derived variables were
quantified on the basis of their importance in mosquito phenology (Clements 1999): yearly
mean, yearly variation, wettest quarter mean (May to October), wettest quarter variation
(May to October). Variations were quantified as the variation coefficients of
yearly/quarterly means in the 13-year period. We selected the wettest quarter on the basis
of mosquito population dynamics (Molineaux and Gramiccia 1980) and because
mosquitoes are mainly captured during this season (Lindsay et al. 1998).
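
As an illustration of how the four NDVI-derived variables could be computed for one territorial unit, the sketch below takes monthly NDVI values per year and returns the means and the coefficients of variation of the yearly and wet-season means. The function names, the input layout and the use of the sample standard deviation are assumptions; the values are invented.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return statistics.stdev(values) / statistics.mean(values)


def ndvi_summary(monthly_ndvi_by_year, wet_months=(5, 6, 7, 8, 9, 10)):
    """Summarise NDVI for one territorial unit.

    monthly_ndvi_by_year: {year: [12 monthly NDVI values, January..December]}
    wet_months: calendar months of the wet period (May to October in the text).
    """
    yearly_means = []
    wet_means = []
    for year in sorted(monthly_ndvi_by_year):
        months = monthly_ndvi_by_year[year]
        yearly_means.append(statistics.mean(months))
        wet_means.append(statistics.mean([months[m - 1] for m in wet_months]))
    return {
        "yearly_mean": statistics.mean(yearly_means),
        "yearly_cv": coefficient_of_variation(yearly_means),
        "wet_mean": statistics.mean(wet_means),
        "wet_cv": coefficient_of_variation(wet_means),
    }


# Two hypothetical years for a single grid square.
toy = {
    2001: [0.2] * 4 + [0.6] * 6 + [0.2] * 2,
    2002: [0.1] * 4 + [0.3] * 6 + [0.1] * 2,
}
summary = ndvi_summary(toy)
```

With real data the dictionary would hold one entry per year of the 13-year series.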

Climate variables. Three predictors were selected from the range of topoclimatic
predictors associated with inversion distributions in Diptera: temperature, rainfall and
elevation (Petrarca et al. 2000; Hoffmann et al. 2004; Ayala et al. 2011). In our study, the
mean temperature and rainfall were quantified for the wettest quarter of the year, because
this is the most important period for mosquito population dynamics (Moffett et al. 2007).
Data on bioclimatic variables and altitude are available from the Worldclim project
database (see Hijmans et al. 2005 for details) at a spatial resolution of ~5 km.

Statistical Analysis

Ecogeographical models. Using an inductive approach, we determined the
macro-ecological requirements of the chromosomal inversions at the locations where they
occurred (Corsi et al. 2000). We related the frequency of each inversion to the predictors
using generalized linear models (GLM) with binomial distribution (number of inversions
relative to the number of sampled mosquitoes per sampling locality, see below) and a
logistic link function (Hosmer et al. 1989). To obtain the most parsimonious model for each
polymorphic rearrangement, we used a forward-backward stepwise model-selection
procedure. Each step was accepted only if it decreased the Akaike Information Criterion (AIC)
(Akaike 1974). Models were built on a subset of randomly selected sampling localities for 
each inversion (80%) and then projected to the whole study area. The remaining sampling 
localities (20%) were used as independent data to evaluate the predictive performance of 
the models. To this aim, calibration plots (Pearce and Ferrier 2000) and Pearson’s 
correlations were employed to statistically determine the relationship between the predicted 
probabilities and the observed frequencies (Zheng and Agresti 2000). Bins with n<15 
specimens were not included in this evaluation, because this is the minimum sample size 
required to estimate a frequency with acceptable accuracy (Jovani and Tella 2006). We 
restricted the ecogeographical models to West-Central Africa to avoid the inclusion of 
sparsely sampled areas in eastern and southern Africa (Figure 1) where the uncertainty of 
the model predictions could not be properly assessed (Heikkinen et al. 2012). Finally, 
potential confounding effects between spatial and environmental factors can be expected in 
models of species spatial distribution (Dormann 2007). To rule out this modelling bias, we 
investigated spatial autocorrelation in the residuals by estimating the Moran’s I index (a 
measure of global spatial autocorrelation) for each model (Table S6). 
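
The core of the modelling step, a binomial GLM with a logit link compared by AIC, can be illustrated with a deliberately small sketch. It fits a single hypothetical predictor by Newton-Raphson and performs one forward step (null model versus one-predictor model); the study itself used a full stepwise procedure over 11 predictors in statistical software, so the function names and the toy counts below are assumptions made only for illustration.

```python
import math


def log_binom_coef(n, y):
    """Logarithm of the binomial coefficient C(n, y)."""
    return math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)


def log_likelihood(probs, inverted, sampled):
    """Binomial log-likelihood of the observed counts given fitted probabilities."""
    return sum(
        log_binom_coef(n, y) + y * math.log(p) + (n - y) * math.log(1.0 - p)
        for p, y, n in zip(probs, inverted, sampled)
    )


def fit_logit(x, inverted, sampled, n_iter=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x for grouped binomial data."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, y, n in zip(x, inverted, sampled):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = n * p * (1.0 - p)      # binomial variance weight
            r = y - n * p              # raw residual
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik


# Hypothetical localities: predictor value, inverted count, sample size.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
inverted = [2, 5, 10, 15, 18]
sampled = [20, 20, 20, 20, 20]

# Null model: a single common frequency.
p_null = sum(inverted) / sum(sampled)
aic_null = aic(log_likelihood([p_null] * len(x), inverted, sampled), 1)

# One forward step: add the predictor and keep it only if AIC decreases.
b0, b1 = fit_logit(x, inverted, sampled)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
aic_full = aic(log_likelihood(fitted, inverted, sampled), 2)
keep_predictor = aic_full < aic_null
```

A full stepwise search would repeat this comparison for every candidate predictor, adding or dropping one term at a time until no change lowers the AIC.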

Hierarchical clustering analysis. Clustering methods are a powerful tool for classifying 
similar objects in groups and are particularly useful for selecting species with similar 
biogeographical patterns (Kreft and Jetz 2010; Olivero et al. 2011) or with 
genetic/molecular similarities (Heard et al. 2005). We investigated similarities among the 
ecogeographical patterns of the polymorphic inversions in West-Central Africa. To carry 
out the hierarchical clustering analysis we randomly selected 1666 evaluation points in 
accordance with our sampling design (Legendre 1998), each of which was attributed with 
the probability of occurrence of each inversion according to the models’ predictions.
To avoid mismatching due to inversion arrangements (standard versus inverted), we used
the absolute value of the Pearson’s correlation coefficient to estimate pairwise distances
between the predicted probability of occurrence of each inversion. This index indicates
similarity in shape between two (or more) profiles, and fits perfectly with the aim of
clustering together inversions that respond similarly to ecogeographical gradients and
species ranges. To maximize similarity within groups, we used the unweighted pair-group
method with arithmetic averages (UPGMA) as agglomeration method. This produces less
distortion from the original similarities than complete or single linkages and is consistently
the best performing clustering algorithm for biogeographical classifications (Kreft and Jetz
2010). We then searched for chorotypes, defined as groups of inversions whose probabilities
of occurrence are similarly distributed within the group and/or dissimilarly in
relation to other inversions. Chorotype detection was performed as described by Olivero et
al. (2011) and significance among dendrogram branches was evaluated with a
G-test of independence (Sokal and Rohlf 1981).
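
The clustering step can be sketched in a few lines. The example assumes that the distance between two inversions is 1 - |r|, with r the Pearson correlation of their predicted probabilities across evaluation points, and implements UPGMA as the mean of all leaf-to-leaf distances between two clusters. The profile names and values are hypothetical, and chorotype significance testing is not covered.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def upgma(profiles):
    """Average-linkage clustering on the distance 1 - |r|.

    profiles: {name: [predicted probability at each evaluation point]}
    Returns the merges in order, as (cluster_a, cluster_b, distance).
    """
    names = list(profiles)
    leaf_dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            leaf_dist[frozenset((a, b))] = 1.0 - abs(r)

    def cluster_distance(c1, c2):
        # UPGMA: arithmetic mean of all distances between members of c1 and c2
        total = sum(leaf_dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    active = [(name,) for name in names]
    merges = []
    while len(active) > 1:
        d, i, j = min(
            (cluster_distance(c1, c2), i, j)
            for i, c1 in enumerate(active)
            for j, c2 in enumerate(active)
            if i < j
        )
        merges.append((active[i], active[j], d))
        merged = active[i] + active[j]
        active = [c for k, c in enumerate(active) if k not in (i, j)] + [merged]
    return merges


# Hypothetical profiles: B mirrors A (as a standard arrangement would mirror
# its inverted counterpart), so the absolute correlation groups them first.
profiles = {
    "A": [0.1, 0.2, 0.4, 0.8],
    "B": [0.9, 0.8, 0.6, 0.2],
    "C": [0.5, 0.1, 0.6, 0.2],
}
merges = upgma(profiles)
```

Using the absolute value of r is what lets two perfectly opposed profiles fall in the same group, which is the stated reason for choosing this index.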

Canonical Ordination. The canonical correspondence analysis (CCA) is a statistical
method for ordering species along canonical axes according to their (ecological) optima
(Ter Braak 1987). We used this technique to relate the probability of inversion occurrence
(on the evaluation points, Figure S2) with the set of ecological predictors (in this case,
excluding spatial predictors to account only for ecological similarity). This allows for fairly
easy ecological interpretation of inversion assemblages (Ayala et al. 2011). The statistical
significance of the canonical axes and environmental predictors was assessed with
permutation tests (10,000 permutations). To improve the CCA ecological comprehension, we
plotted the major habitat boundaries according to Olson et al. (2001). To determine the
number of components to interpret from our CCA, we used the broken stick model (based
on eigenvalues from random data) and the Kaiser-Guttman rule (based on the average 
value of eigenvalues) procedures (Jackson 1993). 
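
Both stopping rules are simple to state in code. In the sketch below, the broken-stick expectation for axis k out of p axes is (1/p) times the sum of 1/i for i from k to p, and the Kaiser-Guttman rule keeps axes whose eigenvalue exceeds the mean eigenvalue. The eigenvalues are hypothetical and are not those of the CCA reported in this study; the function names are illustrative.

```python
def broken_stick(n_axes):
    """Expected proportion of variance per axis under the broken-stick model."""
    return [
        sum(1.0 / i for i in range(k, n_axes + 1)) / n_axes
        for k in range(1, n_axes + 1)
    ]


def axes_to_keep(eigenvalues):
    """Number of leading axes retained by each rule.

    Returns (n_broken_stick, n_kaiser_guttman).
    """
    total = sum(eigenvalues)
    observed = [e / total for e in eigenvalues]
    expected = broken_stick(len(eigenvalues))

    # Broken stick: keep leading axes while observed exceeds expected.
    n_broken_stick = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break
        n_broken_stick += 1

    # Kaiser-Guttman: keep axes with an eigenvalue above the average.
    mean_eigenvalue = total / len(eigenvalues)
    n_kaiser_guttman = sum(1 for e in eigenvalues if e > mean_eigenvalue)
    return n_broken_stick, n_kaiser_guttman


# Hypothetical eigenvalues, sorted in decreasing order.
toy_eigenvalues = [0.50, 0.30, 0.10, 0.06, 0.04]
kept = axes_to_keep(toy_eigenvalues)
```

For these invented eigenvalues both rules retain the first two axes.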

All computational statistics were performed using the IBM-SPSS 19.0 software (IBM
Corporation, New York, USA) and “R” v3.0.1 (R Development Core Team,
http://cran.r-project.org/), with the addition of the “amap” (Lucas 2011), “vegan” (Oksanen et al. 2011),
and “ggvegan” (https://github.com/gavinsimpson/ggvegan) libraries. 


Results 

Data on the inversions and Anopheles species retrieved from the literature and from our 
unpublished work and used for this study are summarized in Table 1, Figure 1 and 
Supplementary Material (Tables S1-S4). 

Biogeographical patterns of chromosomal inversions 

Using a generalized linear model approach, the frequency of each inversion was correlated 
with several ecogeographical predictors to calculate the inversion ecogeographical 
favourability throughout the study area (West-Central Africa, highlighted in Figure 1). 
Table 2 summarizes the predictors that were found to drive inversion frequency variations 
and the explained deviance (goodness-of-fit) for each model (see also Table S5). Despite 
the variability between models, latitude and precipitation were repeatedly the most 
important predictors of the inversion distribution patterns in the four mosquito species. 
Conversely, distance to the equator was the least significant variable in the final models. 
Overall, the models’ predictive performance was high. Indeed, predictions were highly 
correlated with the observed values in the datasets used for the independent validations
(Table S5 and Figure S1). However, in some cases (for instance, 2Rt in An. funestus), model
performance was adequate according to the calibration plots, but could not be statistically
evaluated due to insufficient independent data. Moreover, the inversion models did not
exhibit a significant spatial autocorrelation bias, as indicated by Moran’s I values close to
zero. This means that the residuals of the models for each inversion show a random spatial
pattern (Table S6).
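
For readers unfamiliar with the index, the following sketch computes global Moran's I on a set of residuals. The binary distance-band weights, the `bandwidth` parameter and the toy residuals are assumptions made for illustration, since the weighting scheme used for Table S6 is not described here. Under spatial randomness the expected value is -1/(n-1), which approaches zero as the number of localities grows.

```python
import math


def morans_i(values, coords, bandwidth):
    """Global Moran's I with binary weights.

    Two distinct points are neighbours (weight 1) when their distance is
    at most `bandwidth`, in the same units as the coordinates.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            if math.hypot(dx, dy) <= bandwidth:
                cross_products += dev[i] * dev[j]
                weight_sum += 1.0
    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross_products / variance_sum)


# Four hypothetical localities on a line, one distance unit apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
i_clustered = morans_i([1.0, 1.0, -1.0, -1.0], coords, bandwidth=1.0)
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], coords, bandwidth=1.0)
```

Spatially clustered residuals give a positive index and alternating residuals a negative one, so values near zero are what one expects when the model has absorbed the spatial structure.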

Next, the statistical models were represented in geographical space to obtain cartographic
models of the expected probability of occurrence of a given inversion, on the basis of the
local ecogeographical conditions (Figure 2). The predicted frequencies were subject to the
local occurrence of each species and, accordingly, these maps should be interpreted as the
predicted inversion frequency of the target species in a given locality.

Hierarchical clustering analysis

The predicted inversion frequencies throughout the study area were then correlated with
each other to find similar eco-geographical patterns. Inversions that showed a significantly
similar environmental distribution were grouped in chorotypes. We identified three
significant inversion chorotypes in the study area (Figure 3). The first chorotype contained
the 2Rd inversion in An. gambiae and An. coluzzii; the second contained the inversions 2Ru
in An. coluzzii and 2Rc in An. arabiensis; the third comprised the inversions 2La, 2Rb and
2Rc in An. gambiae and An. coluzzii, and 2Rt, 3Ra, 3Rb and 3La in An. funestus.
Altogether, the dendrogram highlighted important features concerning the inversion spatial
patterns. Overall, chromosome inversions in An. gambiae and An. coluzzii showed a
significant common spatial pattern. Moreover, chromosome inversions in An. funestus, An.
gambiae and An. coluzzii were associated with similar environmental clines, revealing the
presence of significantly correlated eco-geographical patterns. Finally, some inversions 
exhibited a very specific spatial pattern, such as 2Rj in An. gambiae. 

Environmental drivers of inversions 

The maximum correlations between the set of land and climate predictors (spatial 
predictors were excluded to account only for ecological similarity) and the inversion 
frequencies were then determined using a canonical correspondence analysis (CCA) 
method. This statistical approach allowed us to define the ecological optimum of each 
inversion and to represent the overall contribution of the tested ecological predictors to the 
chromosomal polymorphism frequency among species (Figure 4). The seven CCA axes 
were all statistically significant (ANOVA, p< 0.001). In accordance with the broken stick 
model and the Kaiser-Guttman rule, only the first two CCA axes, which accounted for 
almost 80% of the total variance in the inversion dataset, were retained (Jackson 1993). The 
first axis explained 65.61% (ANOVA, F=1219, p<0.001) and the second 13.67% of the 
total variance (ANOVA, F=254, p<0.001). The other five CCA axes represent 10.3%,
6.1%, 2.6%, 0.9% and 0.7%, respectively. The significance of all tested ecological predictors 
was assessed by permutation tests (ANOVA, p<0.001). The first axis could be ascribed to 
an aridity gradient (Figure S2A) and structured chromosomal inversions according to their 
tolerance to aridity, relative to the habitat boundaries. The second CCA axis represented an 
environmental gradient mostly influenced by vegetation productivity patterns and elevation 
(Figure S2B). Here, inversions with an ecological optimum in the savanna biome showed 
the highest frequency variations. 

Discussion

This study elucidates the ecological basis of the spatially structured distribution of 23
chromosomal inversions in the four major malaria vectors in Africa. Three major outcomes
emerge: i) each chromosome inversion shows a specific and unique eco-geographical
pattern, possibly explaining how mosquito species can extend their habitat ranges; ii) some
of the inversion polymorphisms in the four mosquito species exhibit common
ecogeographical patterns, suggesting that they are involved in the adaptation to similar local
pressures; and iii) some inversion polymorphisms, presumably shared by sibling species,
exhibit contrasting ecological patterns, suggesting a different adaptive role. Altogether, our
results establish a strong ecological basis for future genomic studies to elucidate the genetic
bases of inversion contribution to local adaptation in these four malaria vectors.

The role of inversions in environmental adaptation

Chromosome inversions have often been considered as major drivers in the geographical
expansion of many species (Krimbas and Powell 1992), including An. gambiae, An.
coluzzii, An. arabiensis and An. funestus. Our study shows that the frequency of most of
the studied inversions could be explained by the tested ecogeographical gradients (Table 2).
For only two inversions (2Rd and 2Ru in An. coluzzii) did the models explain little of their
ecogeographical variation (only 17% and 26%, respectively). On the other hand, each main
ecogeographical gradient had a different weight on each inversion (Table 2, Figure 2). The
CCA identified the major environmental variables that influenced the distribution of each
inversion and provided evidence that chromosomal rearrangements are linked to different
environmental gradients (Figure 4). To date, chromosomal rearrangements have been
exclusively linked to specific ecogeographical predictors, such as latitude (Hoffmann and
Rieseberg 2008), but our analyses provide a broader picture of the habitats where these
inversions play an important role for local adaptation. For instance, inversion 2Rd in An.
gambiae and An. coluzzii plays a robust role in the adaptation to elevation. It is not the first 
time that an inversion has been associated with altitudinal clines (Collinge et al. 2006). 
Unfortunately, previous studies on the risk of malaria transmission at high altitude in Africa 
did not characterize the chromosome polymorphisms of the adapted populations 
(Tchuinkam et al. 2010). However, the major ecological challenge for these four mosquito 
species is the adaptation to humid conditions. Many of the inversions in An. coluzzii, An. 
gambiae and An. funestus are thought to have a savanna origin (Ayala and Coluzzi 2005; 
Ayala et al. 2009; Kamali et al. 2014) and have been linked to local acclimatization to 
rainforest habitat (Figure 4, Figure 3 - chorotype 3). At least for An. coluzzii and An. 
gambiae, breakpoints analyses have confirmed that species with the inverted 2La karyotype 
(arid savanna) predate species with the standard 2La karyotype (forest) (Sharakhov et al. 
2006; Fontaine et al. 2015). This is in agreement with the crucial role played by these 
rearrangements in malaria expansion to rainforest habitats (Annan et al. 2007). Similarly, 
inversion 3Ra in An. arabiensis appears to be associated with the adaptation to the mosaic 
forest-savannah biome. 

Despite the abundant evidence, we cannot conclude that all the observed inversion distribution 
patterns have an exclusive ecological basis. Indeed, in Anopheles, like in other organisms, 
chromosomal inversions have been correlated also with non-environmental traits 
(Hoffmann and Rieseberg 2008), such as resting behavior (Coluzzi et al. 1977; Bryan et al. 
1987; Costantini et al. 1999), host preference (Coluzzi et al. 1979b; Petrarca and Beier 
1992), insecticide resistance (Brooke et al. 2002) and Plasmodium infection (Petrarca and 
Beier 1992). Therefore, although ecological forces seem to be the major drivers of 
inversion distribution (Table 2), other behavioural and/or physiological traits could have a 
non-negligible effect on their frequency patterns. Another plausible explanation is that 
demographic forces are responsible of the clinal patterns observed. However, several 
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394 factors lead to think that clinal variation is related to the action of natural forces (Endler 

395 1977). Firstly, many inversions exhibit parallel dines across the continent (Table S1-S4). 

396 For instance, in An. gambiae, Simard et al., (Simard et al. 2009) and Coluzzi et al., 

397 (Coluzzi et al. 1979a) found identical clinal patterns in Cameroon and Nigeria, respectively. 

398 Secondly, another important factor is migration. Strong gene flow between populations 

399 would quickly make dissapear any inversion gradient in absence of selection. According to 

400 neutral genetic markers, An. gambiae/ An. coluzzii and An. funestus exhibit parallel genetic 

401 structure across Africa coherent with a common expansion (Lehmann et al. 2003; Michel et 

402 al. 2005). Both continental studies highlight the strong gene flow between natural 

403 populations of these mosquitoes, denoting weak population structure. On the other hand, 

404 scattered countrywide studies in An. arabiensis revealed as well important gene flow 

405 between populations of this vector (Donnelly et al. 1999; Simard et al. 2000). Finally, the 

406 large effective population size in all these mosquitoes reinforce the assumptions of one 

407 panmictic population through the continent with limited gene flow barriers (Donnelly et al. 

408 1999; Lehmann et al. 2003; Michel et al. 2005). Altogheter, these evidences are the 

409 strongest indications that environmental selection forces, and not demographic forces, are 

410 responsible for the observed environmental gradients of most of our inversions. 

411 

Same patterns, same causes: parallel chromosome evolution in Anopheles

The hierarchical clustering analysis showed that chorotype 3 included inversions from three
different species: An. gambiae, An. coluzzii and An. funestus. This cluster can be interpreted
as a biogeographical pattern with strong internal similarity. Moreover, the CCA confirmed
that these inversions have similar ecological gradients (Figure 4). These three species live
in sympatry throughout much of their geographical range in Africa (Gillies and Coetzee
1987). In An. gambiae and An. coluzzii, the origin of the inversions 2La, 2Rb and 2Rc
predates their divergence (White et al. 2009a; Lawniczak et al. 2010; Fontaine et al. 2015);
therefore, we could expect similar patterns among their shared ancestral polymorphisms.

On the other hand, An. funestus and An. gambiae diverged ~35 Mya (Krzywinski and
Besansky 2003; Neafsey et al. 2015). Therefore, any common spatial pattern could be
attributed to similar environmental pressures (Ayala et al. 2009; Sinka et al. 2010). One
hypothesis is that these inversions captured similar sets of genes. Several authors found
nearly perfect synteny preservation when analysing whole chromosome arms
(Sharakhov et al. 2002; Neafsey et al. 2015). This means that if inversions captured similar
sets of genes, they should be on homologous arms. In a key paper, Sharakhova et al. (2011)
investigated the non-random distribution of genes along homologous arms of malaria
vectors. They found significant non-random gene combinations on An. gambiae 2Rb and on
An. funestus 2Rh and 2Rd. Here, we found that inversions 2Rb and 2Rh are both strongly
associated with rainforest habitat (Figure 2 and Figure 4). Unfortunately, the 2Rd inversion
frequencies in An. funestus were too low and uneven to be included in the analysis. However,
this rearrangement has been consistently associated with forest areas in Cameroon (Cohuet
et al. 2005). Therefore, environmental adaptation to very humid conditions could have
preserved specific gene combinations within these three inversions in the two species.
Moreover, Sharakhova et al. (2011) found a near-significant (p = 0.07) association between
2La in An. gambiae and 3Rb in An. funestus. The very similar patterns of the main
environmental gradients for 2La and 3Rb in the study area (Figure 2 and Figure 4) indicate
their common role in adaptation to an aridity cline. However, these two inversions, together
with 3Ra in An. funestus, show limited co-linearity, leaving strong doubts that they captured
comparable large blocks of genes (Sharakhov et al. 2002). Nevertheless, we cannot exclude the
possibility that smaller segments and/or a few genes are shared between the inversions 2La
and 3Ra/3Rb. Theoretically, an inversion would only need to capture one locally adapted
locus to increase in frequency (Kirkpatrick and Barton 2006). The availability of the An.
gambiae, An. coluzzii and An. funestus genomes will make it possible to identify genes
common to these species (Neafsey et al. 2015). In conclusion, despite the fast evolution of
autosomal arms and gene shuffling, natural selection may preserve specific gene
combinations within polymorphic inversions, particularly in distant species, such as An.
gambiae-An. coluzzii and An. funestus, that are subject to similar environmental pressures.
Thus, the most plausible scenario is that inversions in these species occurred independently
(i.e., parallel evolution) and captured locally adapted genes in homologous arms. This
knowledge may be very useful for identifying the gene signatures of local adaptation in
these malaria vectors (Neafsey et al. 2015).

Common, but ecologically divergent inversions

Our analyses revealed some degree of ecogeographical divergence among hypothetically
common inversions in species of the An. gambiae complex. Although An. arabiensis, An.
gambiae and An. coluzzii are characterized by different fixed chromosomal
rearrangements on the X chromosome, they potentially share three chromosome inversions:
2La, which is fixed in An. arabiensis and polymorphic in An. gambiae and An. coluzzii, as
well as 2Rb and 2Rc, which are polymorphic in all three species (Table 1; Coluzzi et al.
2002). Our predictive models, dendrogram and CCA revealed very different
ecogeographical distribution patterns for 2Rb and 2Rc in the three species (Figures 2, 3
and 4; Table 2). The most plausible explanation is that the ecological and local adaptation
patterns (and thus, the captured genes) of these inversions are not the same or have rapidly
evolved after speciation among mosquito species. Originally, these inversions were
characterized by the presence of a banding pattern (Coluzzi et al. 2002); therefore, a
mismatch in breakpoint recognition is highly possible. Moreover, there is extensive evidence
for breakpoint reuse and inversion recycling in other Diptera (Ranz et al. 2007). Indeed,
differences in gene expression patterns might be expected if breakpoints differ
between species (Puig et al. 2004). No molecular information is available for 2Rc. On the
other hand, 2Rb breakpoints have been molecularly characterized in recent years. Lobo et
al. (2010) argued that the homozygous 2Rb inverted form has a single common
origin in all three sibling species, while the homozygous standard form (2Rb+) may have
arisen twice through breakpoint reuse. In our models, the inversion 2Rb+ shows contrasting
ecological patterns in An. gambiae-An. coluzzii and An. arabiensis. On the basis of
molecular data, White et al. (2009b) hypothesized that introgression from An.
arabiensis brought the 2Rb arrangement into An. gambiae, while introgression in the
opposite direction introduced 2Rb+ into An. arabiensis. The occurrence of the latter
introgression was supported by Fontaine et al. (2015), who confirmed the
ancestral status of the standard 2Rb+ form. Recent statistical models have reinforced the
hypothesis that inversions can act as “cassettes of genes that can accelerate adaptation by
crossing species boundaries” (Kirkpatrick and Barrett 2015). On the basis of our results,
we hypothesize that a new 2Rb+ inversion might have appeared through breakpoint reuse
in An. gambiae after speciation of An. arabiensis (Lobo et al. 2010). This new 2Rb+ could
have brought new mutations that favoured (together with other inversions, see above) the
colonization of rainforest habitats by An. gambiae/An. coluzzii (Sharakhov et al. 2006).

New molecular and phenotypic analyses of the inversion 2Rb+ across species and along 
their geographical range might confirm this last hypothesis. 

Conclusions and implications for malaria control

To date, vector control is the main approach for reducing malaria transmission (Enayati and
Hemingway 2010). However, environmental and behavioural diversity within vector
populations constitutes a serious challenge to the efficacy of any malaria control strategy
(Ferguson et al. 2010). Much effort is now required to understand the adaptive
mechanisms of vector species in order to improve such control strategies (Boete 2005;
Windbichler et al. 2011).

Our study reveals the potential roles played by inversion polymorphisms in the ecological
success of the four major malaria mosquitoes. The importance of chromosome inversions in
adaptation is attested by the strong, significant correlations between their frequencies and
ecogeographical predictors, and by the strong, spatially structured patterns identified in the
study area. Therefore, inversion polymorphisms may have enabled considerable
geographical expansion of these mosquitoes, with major consequences for malaria parasite
transmission. Moreover, the extensive reshuffling of gene orders confirms, to some extent,
that this type of chromosomal rearrangement is a very common and frequent local
adaptation mechanism in Anopheles (Kamali et al. 2012). On the other hand, the
convergent evolution of inversions in An. gambiae, An. coluzzii and An. funestus may
provide a suitable basis for comparative studies to identify the genes responsible for
environmental adaptation. Understanding the genetic mechanisms that enable these vectors
to extend their geographical range will have a profound impact on malaria epidemiology.
The complete genome of An. funestus (Besansky 2008; Neafsey et al. 2015) will now
provide considerable opportunities for comparative genomic studies with An. gambiae and
will help elucidate the common mechanisms involved in ecological adaptation. Finally,
besides the adaptive role of inversions, the recognition of ecological divergence within
populations of insect vectors has a direct impact on the efficacy of any vector-borne disease
control strategy (Ferguson et al. 2010). Ecological and behavioural diversification within
species of the An. gambiae complex has expanded malaria transmission spatially and
temporally, compromising the efficacy of malaria control efforts (Molineaux and
Gramiccia 1980). Surveillance of genetic and ecological divergence within vector
populations will ultimately lead to more effective malaria vector control interventions.

Tables.

Table 1. Summary of the chromosome inversions in the four malaria vectors Anopheles
gambiae, An. coluzzii, An. arabiensis and An. funestus.

Table 2. Summary of the models for each polymorphic inversion
Ecogeographical predictors selected in the final models for each inversion and Anopheles
species. Species are coded by colour: An. gambiae (yellow), An. coluzzii (red), An.
arabiensis (blue) and An. funestus (grey). Solid cells represent significant variables for each
model and open cells non-significant variables. The percentage of inversion frequency
variation explained by each predictor in each species is indicated in the last row of each
species. The last column represents the total explained deviance of the final model for each
inversion. Lat: latitude; Long: longitude; Lat x Long: the product of latitude and longitude;
abs(Lat): distance to the equator expressed as absolute latitude value; Temp: mean temperature
of the wettest quarter of the year; Precip: mean precipitation of the wettest quarter of the
year; NDVI (normalized difference vegetation index) is a numerical indicator that uses
remote sensing measurements of live green vegetation (NDVI yearly mean, yearly
variation, quarterly mean and variation and wettest quarter mean and variation for the
period included in the study).

Table 1

                Specimens  Villages  Inversions  Inversion names
An. gambiae     7949       799       6           2La; 2Rb, 2Rc, 2Rd, 2Ru, 2Rj
An. coluzzii    5000       528       5           2La; 2Rb, 2Rc, 2Rd, 2Ru
An. arabiensis  12836      125       5           2Ra, 2Rb, 2Rc, 2Rdl; 3Ra
An. funestus    9833       165       7           2Ra, 2Rh, 2Rab, 2Rt; 3Ra, 3Rb; 3La

Table 2.

Inversions            Lat   Long  LatxLong  abs(Lat)  Elevation  Temp  Precip  NDVI year  NDVI year  NDVI quarter  NDVI quarter  Total
                                                                               mean       variation  mean          variation
2La
2Rb
2Rc
2Rd
2Ru
2Rj
Total An. gambiae
2La
2Rb
2Rc
2Rd
2Ru
Total An. coluzzii
2Ra                                                                                                                              65%
2Rb                                                                                                                              90%
2Rc                                                                                                                              86%
2Rdl                                                                                                                             55%
3Ra                                                                                                                              89%
Total An. arabiensis  80%   80%   80%       60%       60%        40%   60%     20%        40%        20%           60%
2Ra
2Rh
2Rab
2Rt
3Ra
3Rb
3La
Total An. funestus    100%  86%   71%       57%       71%        86%   100%    86%        71%        71%           100%
Total                 96%   87%   83%       30%       70%        52%   87%     65%        61%        57%           78%

Figures.

Figure 1. Sampling villages
Map of the main African habitat types (Olson et al. 2001) showing the distribution of the
villages where the four Anopheles species were sampled: Anopheles gambiae (yellow dots),
An. coluzzii (red dots), An. arabiensis (blue dots) and An. funestus (grey dots). A detailed
representation of the West-Central Africa area that was selected to plot the predicted maps
and carry out the similarity analyses is shown.

Figure 2. Maps of the predicted inversion frequencies for Anopheles gambiae, An. coluzzii,
An. arabiensis and An. funestus. Predicted inversion distributions were passively plotted in
the West-Central Africa study area. Blue represents a probability of 100% for the standard
form and red a probability of 100% for the inverted form, according
to the literature data for each species (Green and Hunt 1980; Coluzzi et al. 2002). To
improve their representation, probabilities were reclassified into four classes: 0.00-0.25;
0.25-0.50; 0.50-0.75; 0.75-1.0.

Figure 3. Dendrogram of the predicted inversion frequency distribution in West-Central
Africa showing similar environmental patterns in Anopheles species.
Inversions included in the three chorotype clusters (Ch.1, Ch.2 and Ch.3) are enclosed in
squares. Anopheles spp. are coded by letters and colours: g (yellow): An. gambiae; c (red):
An. coluzzii; a (blue): An. arabiensis; f (grey): An. funestus.

Figure 4. Canonical correspondence analysis of the inversion ecological distribution
throughout West-Central Africa to highlight local adaptation patterns among chromosome
inversions and mosquito species.
CCA diagram showing the ordination of the chromosomal inversions (standard and
inverted; red crosses indicate their ecological optima) for each species along the first two
canonical axes (CCA1 and 2) that, together, explain ~80% of the variance. Anopheles spp. are
coded by letters and colours: g (yellow): An. gambiae; c (red): An. coluzzii; a (blue): An.
arabiensis; f (grey): An. funestus. Asterisks represent the standard form of each inversion.
Ecological predictors are passively plotted on the graph: elevation, temperature (mean
temperature of the wettest quarter of the year), precipitation (mean precipitation of the
wettest quarter of the year) and NDVI variables (yearly mean, yearly variation, quarterly
variation and wettest quarter variation for the period included in the study).

Supporting Information

Tables S1-S4. Chromosomal inversion data used in this study.
Table S1: Anopheles gambiae; Table S2: Anopheles coluzzii; Table S3: Anopheles funestus;
Table S4: Anopheles arabiensis.
Lat: latitude (decimal degrees); Long: longitude (decimal degrees); Lat+X°: latitude + X° to
obtain all positive values; Long+X°: longitude + X° to obtain all positive values;
(Lat+X°)*(Long+X°): the product of the shifted latitude (Lat+X°) and longitude (Long+X°);
abs(Lat): absolute latitude value; Elevation (in m); Mean temperature of the wettest quarter
of the year (in °C); Mean precipitation of the wettest quarter of the year (in mm); NDVI
yearly mean and yearly variation for the period included in the study; NDVI quarterly mean:
NDVI wettest quarter mean for the period included in the study; NDVI quarterly variation:
NDVI wettest quarter variation for the period included in the study; Sample size: number of
mosquitoes karyotyped in the study (in An. arabiensis, missing numbers for some villages
were arbitrarily replaced by 20); Inv_XX: inversion frequency.

Table S5. Complete inversion models.

                                Predictor variables                                                                                                Performance
Model                 Cte       Lat     Long    LatxLong  abs(Lat)  Elevation  Temp    Precip    NDVI year  NDVI year  NDVI quarter  NDVI quarter  %D     R (n)
                                                                                                            variation                variation
An. gambiae     2La   -43.795   -0.581  -1.852  0.043     -         0.001      -       -0.004    -1.309     -          1.331         0.03          88.23  0.99 (8)
                2Rb   -52.402   -0.696  -2      0.048     -         5.00E-04   -       -0.002    -0.798     0.211      0.831         -             81.18  0.97 (8)
                2Rc   -28.007   1.284   0.859   -0.015    -         0.005      0.092   0.009     -          -0.184     -             -0.039        46.51  0.98 (7)
                2Rd   -23.316   0.622   0.238   -         -         0.001      -       0.1E-20*  0.106*     -          -0.038*       0.213         71.94  0.90 (6)
                2Ru   12.74     0.423   0.621   -0.019    -         -          -       0.7E-16*  -1.036     -0.026*    1.048         -             49     0.03 (5) ns
                2Rj   -145.609  0.629   -1.277  0.025     -         0.023      0.305   0.01      10.351     -0.53      -10.279       0.135         69.79  0.99 (4)
An. coluzzii    2La   21.811    1.142   -       0.003     -         -0.005     -0.11   -0.003    -          -0.206     -             -0.079        86.71  # (2)
                2Rb   6.693     1.082   0.828   -0.015    -         -          -       -         0.006      -          -             -0.077        56.16  0.98 (5)
                2Rc   4.563     1.254   0.899   -0.018    -         0.001*     0.004*  0.002     -          0.014*     -             -0.06         63.11  0.99 (7)
                2Rd   -12.445   0.291   0.107   -         -         0.001      -       0.2E-16*  0.148      -          -0.112*       0.056         16.76  # (1)
                2Ru   -31.012   -0.789  -1.536  0.031     -         -          -       -0.001*   -1.213*    0.007*     1.222*        -             26.42  # (3)
An. funestus    2Ra   7.014     -0.043  -0.101  0.001     0.198     -          -0.042  0.002     -0.005     0.168      -             -0.157        60.33  0.97 (8)
                2Rh   -7.412    -0.724  -0.539  0.014     -         -0.004     -0.079  0.01      -4.034     0.454      4.058         0.155         79.64  0.90 (5)
                2Rab  -51.363   1.33    0.275   -         -         -          -       0.014     -3.934     -          4.061         0.149         85.47  # (2)
                2Rt   48.472    0.270*  -       -0.008    -         -0.008     -0.243  -0.003    0.142      -          -             -0.197        88.78  # (3)
                3Ra   -19.248   -0.262  -0.201  0.004     0.035     0.001      0.0376  -0.006    -4.469     0.12       4.486         -0.061        47.11  0.93 (7)
                3Rb   3.442     -0.642  -0.403  0.008     0.048     -0.002     -0.045  0.003     -2.558     0.057      2.572         0.037         70.45  0.64 (7) ns
                3La   14.947    -0.296  -0.094  -         -0.144    -0.003     -0.081  0.003     -          0.095      0.038         0.118         80.15  0.90 (8)
An. arabiensis  2Ra   1.981     -0.231  -0.415  0.008     -         -0.002     -0.028  -0.004    -          -          -             0.019         64.77  0.92 (6)
                2Rb   0.627     0.064   -       -0.001    0.025     -          -       0.003     0.006      0.032      -             -             89.9   0.78 (8)
                2Rc   -8.203    -       -0.511  0.009     -0.397    -          -       -         -          -0.039     -             -             85.8   0.85 (4) ns
                2Rdl  -3.943    0.119   -0.027  -         -         -0.001     -       -         -          -          -             -0.019        55.3   # (2)
                3Ra   2.133     -0.475  -0.48   0.009     0.107     0.001      -0.037  -0.002    -          -          0.007         0.03          88.61  0.97 (5)

Statistical parameters (coefficients, p-values and percentages of explained deviance [%D])
of the final models selected for each inversion and Anopheles species. The model
performance was also assessed using independent data (i.e., 20% of localities not used for
the modelling approach) and Pearson’s correlations (R) computed on the values represented in
the calibration plots (see Figure S1) (i.e., the relation between the predicted probability of
occurrence and the observed proportion of mosquitoes with a given inversion).
The observed prevalence was estimated only for bins with more than 15 specimens (n)
following Jovani and Tella (2006), and Pearson’s R value was not obtained if n<4 (noted as
#). Variables included in the models were: Lat: latitude; Long: longitude; Lat x Long: the
product of latitude and longitude; abs(Lat): distance to the equator; Elevation; Temp: mean
temperature of the wettest quarter of the year; Precip: mean precipitation of the wettest
quarter of the year; NDVI year mean: NDVI yearly mean for the period included in this
study; NDVI year variation: NDVI yearly variation for the period included in this study;
NDVI quarter mean: NDVI wettest quarter mean for the period included in this study;
NDVI quarter variation: NDVI wettest quarter variation for the period included in this
study.
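
The calibration check described in this legend can be illustrated with a minimal Python
sketch. It is not the code used in the study: the number of probability bins (ten
equal-width bins), the record layout and the function names are assumptions made for
illustration only. It keeps bins with more than 15 specimens and returns no R value when
fewer than four bins remain, as stated above.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation; returns None if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def calibration_r(records, n_bins=10, min_specimens=15, min_bins=4):
    """Correlate predicted probability with observed proportion per bin.

    records: iterable of (predicted_probability, n_karyotyped, n_with_inversion),
    one tuple per held-out locality (hypothetical layout).
    n_bins=10 is an assumption of this sketch, not a value taken from the study.
    Returns (R, n) where n is the number of bins retained; R is None if n < min_bins.
    """
    specimens = [0] * n_bins
    carriers = [0] * n_bins
    for predicted, n_total, n_inverted in records:
        # A prediction of exactly 1.0 is placed in the top bin.
        b = min(int(predicted * n_bins), n_bins - 1)
        specimens[b] += n_total
        carriers[b] += n_inverted

    midpoints, observed = [], []
    for b in range(n_bins):
        if specimens[b] > min_specimens:  # keep bins with more than 15 specimens
            midpoints.append((b + 0.5) / n_bins)
            observed.append(carriers[b] / specimens[b])

    if len(midpoints) < min_bins:
        return None, len(midpoints)  # corresponds to "#" in the table
    return pearson_r(midpoints, observed), len(midpoints)
```

Under these assumptions, calling `calibration_r` on the held-out localities of one inversion
returns a pair that maps onto the "R (n)" column of the table.
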
Table S6. Spatial auto-correlation among the inversion models.

Species         Inversion  Moran's I
An. coluzzii    2La        -0.017
                2Rb        0.053
                2Rc        0.047
                2Rd        0.023
                2Ru        0.033
                Mean       0.028 (-0.017 - 0.053)
An. gambiae     2La        0.043
                2Rb        0.053
                2Rc        0.071
                2Rd        0.058
                2Ru        0.040
                2Rj        0.026
                Mean       0.048 (0.026 - 0.071)
An. arabiensis  2Ra        0.034
                2Rb        0.032
                2Rc        0.113
                2Rdl       0.005
                3Ra        0.010
                Mean       0.053 (0.005 - 0.113)
An. funestus    2Ra        0.127
                2Rh        0.030
                2Rab       -0.010
                2Rt        -0.001
                3Ra        0.045
                3Rb        0.127
                3La        -0.027
                Mean       0.027 (-0.027 - 0.127)

The spatial auto-correlation of residuals was measured for each inversion model by
estimating Moran’s I index using the R package ‘spdep’ (R Development Core Team,
http://cran.r-project.org/web/packages/spdep/index.html). Moran’s I ranges from -1 to 1;
values close to 0 indicate random spatial patterns.
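
For readers unfamiliar with the index, the following minimal Python sketch computes Moran’s I
for a set of model residuals. It is an illustration only and does not reproduce the ‘spdep’
workflow used here: the binary distance-band weights, the planar Euclidean distance and the
function names are assumptions of the sketch.

```python
import math


def distance_band_weights(coords, threshold):
    """Binary spatial weights: w[i][j] = 1 if sites i and j lie within `threshold`.

    coords: list of (x, y) pairs in planar units (assumption: projected coordinates).
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if d <= threshold:
                w[i][j] = 1.0
    return w


def morans_i(values, weights):
    """Moran's I = (n / W) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z = values - mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    denom = sum(d * d for d in z)
    if w_sum == 0 or denom == 0:
        raise ValueError("Moran's I is undefined without neighbours or without variance")
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / w_sum) * (cross / denom)
```

With residuals that trend smoothly along a transect the index is positive, whereas residuals
that alternate in sign between neighbouring sites give a negative value.
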
Figure S1. Calibration plot for each inversion model

(A): Anopheles gambiae; (B): An. coluzzii; (C): An. arabiensis; (D): An. funestus. Open
circles indicate bins with more than 15 specimens; filled circles indicate bins with fewer
than 15 specimens.

Figure S2. Representation of the first two CCA axes for the West-Central Africa study
area.

(A) CCA 1. (B) CCA 2. Dots indicate the evaluation points where the probability of
occurrence of each inversion was predicted by the models. Colour varies according to the
CCA value.